WAD-CMSN: Wasserstein distance-based cross-modal semantic network for zero-shot sketch-based image retrieval
Authors
Abstract
Zero-shot sketch-based image retrieval (ZSSBIR) aims at retrieving natural images given free hand-drawn sketches that may not appear during training. Previous approaches used semantically aligned sketch-image pairs or a memory-expensive fusion layer to project the visual information to a low-dimensional subspace, which ignores the significant heterogeneous cross-domain discrepancy between a highly abstract sketch and the relevant image. This may yield poor performance in the training phase. To tackle this issue and overcome this drawback, we propose a Wasserstein distance-based cross-modal semantic network (WAD-CMSN) for ZSSBIR. Specifically, it first projects the visual information of each branch (sketch, image) to a common low-dimensional semantic subspace via Wasserstein distance in an adversarial training manner. Furthermore, a novel identity matching loss is employed to select useful features, which can not only capture complete semantic knowledge, but also alleviate the over-fitting phenomenon caused by the WAD-CMSN model. Experimental results on the challenging Sketchy (Extended) and TU-Berlin datasets indicate the effectiveness of the proposed WAD-CMSN model over several competitors.
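As a rough illustration of the quantity the abstract refers to, the sketch below computes the Wasserstein-1 distance between two small one-dimensional samples. It is not the paper's method: the abstract describes estimating the distance adversarially over learned features, whereas this toy uses the closed form that holds only for equal-size 1-D empirical samples. The function name `wasserstein_1d` and the feature values are hypothetical.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-size 1-D empirical samples.

    For equal-size 1-D samples, the optimal transport plan matches the
    i-th smallest value of xs to the i-th smallest value of ys.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# Hypothetical 1-D "features" for the two branches (illustrative values only).
sketch_feats = [0.0, 1.0, 2.0]
image_feats = [3.0, 4.0, 5.0]

# Distance between the two feature distributions before any alignment.
gap_before = wasserstein_1d(sketch_feats, image_feats)

# Shifting the sketch features toward the image features shrinks the distance,
# which is the effect a learned projection into a common subspace aims for.
aligned_sketch = [x + 3.0 for x in sketch_feats]
gap_after = wasserstein_1d(aligned_sketch, image_feats)

print(gap_before, gap_after)
```

In a real model the shift would be replaced by trainable sketch and image encoders, and the distance would be approximated by a critic network rather than computed by sorting.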
Similar resources
Zero-Shot Sketch-Image Hashing
Recent studies show that large-scale sketch-based image retrieval (SBIR) can be efficiently tackled by cross-modal binary representation learning methods, where Hamming distance matching significantly speeds up the process of similarity search. Providing training and test data subjected to a fixed set of pre-defined categories, the cutting-edge SBIR and cross-modal hashing works obtain acceptab...
A Radon-based Convolutional Neural Network for Medical Image Retrieval
Image classification and retrieval systems have gained more attention because of easier access to high-tech medical imaging. However, the lack of availability of large-scaled balanced labelled data in medicine is still a challenge. Simplicity, practicality, efficiency, and effectiveness are the main targets in medical domain. To achieve these goals, Radon transformation, which is a well-known t...
Attribute-Guided Network for Cross-Modal Zero-Shot Hashing
Zero-Shot Hashing aims at learning a hashing model that is trained only by instances from seen categories but can generalize well to those of unseen categories. Typically, it is achieved by utilizing a semantic embedding space to transfer knowledge from seen domain to unseen domain. Existing efforts mainly focus on single-modal retrieval task, especially Image-Based Image Retrieval (IBIR). Howeve...
Sketch Based Image Retrieval
Sketch based image retrieval is a task that has been explored a lot recently as an alternative method for image retrieval. We develop this task on The Sketchy Database, where we use Siamese and Triplet network to perform sketch based image retrieval. We employ deep residual learning network as the constituent network in the Siamese and Triplet architecture and use new data augmentation techniqu...
Sketch Based Image Retrieval
Content-based image retrieval (CBIR) is one of the most common and fastest-growing research areas in digital image processing. Most of the existing image search tools, such as Google Images as well as Yahoo! Image search, are built on textual annotation of images. In these tools, images are physically annotated with keywords and then retrieved using text-based search methods. The presentations ...
Journal
Journal title: International Journal of Wavelets, Multiresolution and Information Processing
Year: 2022
ISSN: 0219-6913, 1793-690X
DOI: https://doi.org/10.1142/s0219691322500540